Problem Statement

Business Context

Renewable energy sources play an increasingly important role in the global energy mix, as the effort to reduce the environmental impact of energy production increases.

Of all the renewable energy alternatives, wind energy is one of the most developed technologies worldwide. The U.S. Department of Energy has put together a guide to achieving operational efficiency using predictive maintenance practices.

Predictive maintenance uses sensor information and analysis methods to measure and predict degradation and future component capability. The idea behind predictive maintenance is that failure patterns are predictable and if component failure can be predicted accurately and the component is replaced before it fails, the costs of operation and maintenance will be much lower.

The sensors fitted across the different machines involved in energy generation collect data on various environmental factors (temperature, humidity, wind speed, etc.) and additional features related to various parts of the wind turbine (gearbox, tower, blades, brake, etc.).

Objective

“ReneWind” is a company working on improving the machinery and processes involved in wind energy production using machine learning. It has collected sensor data on generator failures in wind turbines and has shared a ciphered version of that data, because the data collected through sensors is confidential (the type of data collected varies between companies). The data has 40 predictors, with 20,000 observations in the training set and 5,000 in the test set.

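The objective is to build several classification models, tune them, and select the one that best identifies failures, so that generators can be repaired before they break and the overall maintenance cost is reduced. The predictions made by the classification model translate as follows:

True positives (TP) are failures correctly predicted by the model; these result in repair costs.
False negatives (FN) are real failures that the model does not detect; these result in replacement costs.
False positives (FP) are detections where there is no failure; these result in inspection costs.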

It is given that the cost of repairing a generator is much less than the cost of replacing it, and the cost of inspection is less than the cost of repair.
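
The cost ordering above can be made concrete with a small sketch. The unit costs below are hypothetical placeholders (the problem statement gives only the ordering replacement > repair > inspection), and `maintenance_cost` is an illustrative helper rather than part of the original notebook. It assumes that a correctly predicted failure leads to a repair, a missed failure to a replacement, and a false alarm to an inspection.

```python
# Hypothetical unit costs; only their ordering comes from the problem statement.
REPLACEMENT_COST = 100.0  # failure missed by the model (false negative)
REPAIR_COST = 30.0        # failure caught in time (true positive)
INSPECTION_COST = 5.0     # false alarm (false positive)


def maintenance_cost(tp: int, fn: int, fp: int) -> float:
    """Total cost implied by confusion-matrix counts under the assumed unit costs."""
    return tp * REPAIR_COST + fn * REPLACEMENT_COST + fp * INSPECTION_COST


# Two hypothetical models scored on the same 100 real failures.
high_recall = maintenance_cost(tp=90, fn=10, fp=40)  # more false alarms, fewer misses
low_recall = maintenance_cost(tp=70, fn=30, fp=5)    # fewer false alarms, more misses
print(high_recall, low_recall)
```

Under these assumed costs, the model that misses fewer failures is cheaper overall even though it triggers more inspections, which is why missed failures deserve the most attention when comparing models.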

In the target variable, “1” represents “failure” and “0” represents “no failure”.

Data Description

The data provided is a transformed version of the original data which was collected using sensors.

Both datasets consist of 40 predictor variables and 1 target variable.

Installing and Importing the necessary libraries

Loading the Data

Data Overview

Exploratory Data Analysis

Utility functions

Univariate analysis

Bivariate Analysis

Data Preprocessing

Model Building

Model Evaluation Criterion

Write down the model evaluation criterion with rationale
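
One candidate criterion, offered here as an assumption rather than as the notebook's stated choice, is recall on the failure class, reported alongside precision: a missed failure leads to a replacement, the most expensive outcome in the problem statement, while a false alarm only costs an inspection. The sketch below computes both metrics (and F1) from binary labels using plain Python; the helper names and the toy labels are hypothetical.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels where 1 means failure."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def recall_precision_f1(y_true, y_pred):
    """Recall, precision and F1 for the failure class (label 1)."""
    tp, fp, fn, _ = confusion_counts(y_true, y_pred)
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return recall, precision, f1


# Toy labels for illustration only (not taken from the ReneWind data).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
recall, precision, f1 = recall_precision_f1(y_true, y_pred)
print(recall, precision, f1)
```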

Initial Model Building (Model 0)

Utility functions

Classification report

Model Performance Improvement

Model 1

SGD optimizer with 50 epochs

Let's check the model performance on training and validation data.

Precision and recall improved after increasing the number of epochs to 50.

Model 2

SGD optimizer with batch size 32 and 10 epochs

Model 3

Keeping the batch size at 32 and increasing the number of epochs to 50
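
To see what changing the number of epochs means in terms of training effort, the sketch below counts the gradient updates performed with mini-batch SGD. The training-set size of 16,000 is a hypothetical figure (for example, an 80/20 split of the 20,000 training observations); the actual split used in the notebook is not stated. `gradient_updates` is an illustrative helper.

```python
import math


def gradient_updates(n_samples: int, batch_size: int, epochs: int) -> int:
    """Number of weight updates: one per mini-batch, repeated for every epoch."""
    steps_per_epoch = math.ceil(n_samples / batch_size)  # last batch may be partial
    return steps_per_epoch * epochs


N_TRAIN = 16_000  # hypothetical size of the training split

updates_10_epochs = gradient_updates(N_TRAIN, batch_size=32, epochs=10)
updates_50_epochs = gradient_updates(N_TRAIN, batch_size=32, epochs=50)
print(updates_10_epochs, updates_50_epochs)
```

With the batch size fixed, going from 10 to 50 epochs multiplies the number of updates by five, so differences between the two runs reflect longer training rather than a change in how each update is computed.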

Model 4

Adam optimizer with the learning rate reduced to 0.0001
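
For intuition about what the learning rate controls in Adam, here is a minimal single-parameter sketch of the textbook Adam update. It is not the notebook's code; `adam_step` is an illustrative helper, and the `beta1`, `beta2`, and `eps` values are commonly used defaults assumed here.

```python
import math


def adam_step(param, grad, m, v, t, lr=1e-4, beta1=0.9, beta2=0.999, eps=1e-7):
    """One Adam update for a single scalar parameter at time step t (t >= 1)."""
    m = beta1 * m + (1 - beta1) * grad          # running mean of gradients
    v = beta2 * v + (1 - beta2) * grad * grad   # running mean of squared gradients
    m_hat = m / (1 - beta1 ** t)                # bias correction for the mean
    v_hat = v / (1 - beta2 ** t)                # bias correction for the variance
    param = param - lr * m_hat / (math.sqrt(v_hat) + eps)
    return param, m, v


# First step from zero-initialised moments with a large and a small gradient.
p_large, _, _ = adam_step(1.0, grad=5.0, m=0.0, v=0.0, t=1)
p_small, _, _ = adam_step(1.0, grad=0.05, m=0.0, v=0.0, t=1)
print(1.0 - p_large, 1.0 - p_small)
```

Because the update is normalised by the gradient's running magnitude, the first step moves the parameter by roughly `lr` in both cases, so lowering the learning rate to 0.0001 directly caps how far each weight moves per update.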

Model 5

Adam optimizer with lr 0.0001 and Dropout 0.2
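
As a reminder of what a dropout rate of 0.2 does, the sketch below implements inverted dropout on a list of activations in plain Python. It is a simplified illustration under stated assumptions, not the layer used in the notebook; the `dropout` helper and the example activations are hypothetical.

```python
import random


def dropout(activations, rate=0.2, training=True, rng=None):
    """Inverted dropout: zero each unit with probability `rate` during training
    and scale the survivors by 1 / (1 - rate); do nothing at inference time."""
    if not training or rate == 0.0:
        return list(activations)
    rng = rng or random.Random()
    keep = 1.0 - rate
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


rng = random.Random(0)  # fixed seed so the illustration is repeatable
hidden = [0.5, 1.0, 1.5, 2.0, 2.5]
print(dropout(hidden, rate=0.2, rng=rng))   # training: some units zeroed, rest scaled
print(dropout(hidden, training=False))      # inference: activations unchanged
```

Scaling the surviving activations keeps their expected sum the same in training and inference, which is why no extra rescaling is needed when the model is evaluated on validation or test data.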

Model 6

Adam optimizer with lr 0.0001, Dropout 0.2, and He weight initialisation
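
He initialisation draws each weight from a zero-mean distribution whose standard deviation is sqrt(2 / fan_in), which suits ReLU-style activations. The sketch below is a simplified plain-Gaussian version (library implementations may use a truncated normal instead). The input size of 40 matches the number of predictors; the layer width of 64 is a hypothetical choice, since the notebook's architecture is not shown.

```python
import math
import random


def he_normal(fan_in, fan_out, rng):
    """Weight matrix (fan_in x fan_out) sampled from N(0, 2 / fan_in)."""
    std = math.sqrt(2.0 / fan_in)
    return [[rng.gauss(0.0, std) for _ in range(fan_out)] for _ in range(fan_in)]


FAN_IN = 40    # number of predictors in the data
FAN_OUT = 64   # hypothetical width of the first hidden layer

weights = he_normal(FAN_IN, FAN_OUT, random.Random(0))

# Compare the empirical spread of the sampled weights with the target value.
flat = [w for row in weights for w in row]
mean = sum(flat) / len(flat)
sample_std = math.sqrt(sum((w - mean) ** 2 for w in flat) / len(flat))
target_std = math.sqrt(2.0 / FAN_IN)
print(round(target_std, 4), round(sample_std, 4))
```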

Model Performance Comparison and Final Model Selection

To select the final model, we compare the performance of all the models on the training and validation sets.

Training Performance Comparison

Validation Performance Comparison

Now, let's check the performance of the final model on the test set.

Actionable Insights and Recommendations

Business Recommendations

Summary


The final model (Model 5) provides a strong balance of recall and precision to support cost-effective predictive maintenance. By operationalizing this model, ReneWind can reduce replacement costs, improve uptime, and optimize maintenance resources, positioning itself as a leader in efficient renewable energy operations.